SUMMARY


Adaptive or "learning" systems can automatically modify their own
structures to optimize performance based on past experiences. The system
designer "teaches" by showing the system examples of input signals or
patterns and simultaneously what he would like the output to be for each
input. The system in turn organizes itself to comply as well as possible
with the wishes of the designer.

An adaptive pattern classification machine (called "Adaline", for
adaptive linear) has been devised to illustrate adaptive behavior and
artificial learning. During a training phase, crude geometric patterns
are fed to the machine by setting the toggle switches in a 4x4 input array.
Setting another toggle switch tells the machine whether the desired output
for the particular input pattern is +1 or -1. All input patterns are
classified into two categories. The system learns a little from each pattern
and accordingly experiences a design change. After training, the machine
can be used to classify the original patterns and noisy (distorted)
versions of these patterns.

A statistical theory has been developed which relates the competence
of the classifier to the amount of experience had (number of patterns
"seen" in adapting). Imperfect system adjustment results from
small-sample-size experience. The misadjustment, a dimensionless quantitative
measure of the quality of adaption, is defined as the ratio of the increase
in probability of error of a system adapted to a small number of patterns
to the probability of error of a "best-adapted" system (adapted to an
arbitrarily large number of patterns). Treating the classifier as a
roughly quantized sampled-data system, a statistical theory of adaption
developed for adaptive sampled-data systems has been utilized to derive
a formula for misadjustment:

$$M = \frac{n + 1}{N}$$


The number of input lines is (n + 1), and the number of patterns seen in
adapting is N. This formula leads to a basic "rule of thumb" for adaptive
classifiers: The number of patterns required to train an adaptive
classifier is equal to several times the number of bits per pattern. This
rule applies without regard to patterns and noise characteristics.
Experimental evidence is presented.

The pattern classifier is actually an adaptive switching circuit
having a set of binary inputs and a binary output. The signal on each
input line is either +1 or -1 according to the setting of the individual
pattern switch. The sixteen input signals are linearly combined and
then quantized. The weights (which could be positive or negative) are
determined by an array of potentiometer settings.

Iterative gradient methods are used during the training phase to
find the potentiometer settings that minimize the number of classification
errors. A simple procedure has been devised which does not require
actual measurement of gradient, and which guarantees convergence and
permits control of rate of convergence. Adaline can usually adapt after
seeing ten to twenty patterns and can easily distinguish a dozen different
basic patterns.

As a generic form of switching functions, Adaline is not completely
general. The set of all possible potentiometer settings allows the
realization of the "linearly separated truth functions", a subclass of all
switching functions. Although this subclass is restricted, it is a useful
class, and, most important, it is a searchable class (the best within the
class can be found without trying all possibilities). Networks of Adalines
overcome this restriction and are far more general, yet present adaption
problems of no greater difficulty than those of single Adalines.

At  present  the  purely  mechanical  adaption  process  is  accomplished 
by  manual  potentiometer-setting.  A  means  of  automating  this  is  being 
developed  which  makes  use  of  multi-aperture  ferromagnetic  devices. 
Solid-state adaptive logical elements will result that should ultimately
be suitable to be microminiaturized. Networks of such elements would be
very effective in pattern recognition systems, information storage and
retrieval-by-classification systems, and self-repairing logical and
computing systems.

I. INTRODUCTION


The modern science of switching theory began with work by Shannon (Ref. 1)
in 1938. The field has developed rapidly since then, and at present a
wealth of literature (Ref. 2) exists concerning the analysis and synthesis of
logical networks which might range from simple interlock systems to
telephone switching systems to large-scale digital computing systems.

A  key  idea  in  switching  theory  is  that  the  performance  requirements 
of  any  logical  system  can  be  completely  specified  by  a  boolean  function 
expressing  output  conditions  in  terms  of  input  conditions,  and  that  the 
algebraic  symbols  in  the  boolean  function  are  readily  identifiable  with 
simple storage and "and"-"or" elements. The problem of simplification
of  networks  for  most  economic  realization  is  reduced  to  a  problem  of 
algebraic  simplification  of  boolean  functions,  a  task  which  is  more 
easily  accomplished  by  human  designers  than  reduction-by-inspection  of 
logical  networks  themselves. 

An  example  illustrating  the  use  of  switching  theory  is  that  of  the 
design  of  an  interlock  system  for  the  control  of  traffic  in  a  railroad 
switch  yard.  The  first  step  is  the  preparation  of  a  "truth  table",  an 
exhaustive listing of all input possibilities (the positions of all
incoming and outgoing trains), and what the desired system output should
be  (what  the  desired  control  signals  should  be)  for  each  input  situation. 
The  next  step  is  the  construction  of  the  boolean  function,  and  the 
following  steps  are  algebraic  reduction  and  design  of  the  logical  control 
system. 

The design of a traffic control system is an example wherein the
truth  table  must  be  followed  precisely  and  reliably.  Errors  would  be 
destructive. 

The design of the arithmetic element of a digital computer is another
example wherein the truth table must be followed precisely. There are
other situations in which some errors are inevitable, however, and here
errors are usually costly but not catastrophic. These situations call
for statistically optimum switching circuits. A common performance
objective is the minimization of the average number of errors. An
example is that of prediction of the next bit in a correlated stochastic
binary number sequence. The predictor output is to be a logical
combination of a finite number of previous input sequence bits. An optimum
system is a sequential switching circuit that predicts with a minimum
number of errors.

Suppose that a record of the binary sequence is printed on tape and
cut up into pieces (with indication of the positive direction of time
preserved), say 50 bits long. Place all pieces where the most recent
event is ONE in one pile and the remainder in another pile. Delete the
most recent bit on each piece of tape. If the statistical scheme could
be discovered by which the pieces of tape are classified, this would lead
to a prediction scheme. It is apparent that prediction is a certain kind
of classification.

Assuming statistical regularity, a reasonable way to proceed might
be to form a truth table, and let the data from each piece of tape be an
entry in the table. It might be expected that with the data of 100 pieces
of tape, a fairly good predictor could be developed. The truth table
would have only 100 entries however, out of a total of $2^{49}$. The "best"
way to fill in the remainder of the truth table depends upon the nature
of the sequence statistics and the error cost criteria. Filling in the
table is a difficult and a crucial part of the problem. Even if the
truth table were filled in, however, the designer would have the
difficult task of realizing a logical network to satisfy a truth table with
$2^{49}$ entries.

An  approach  to  such  problems  is  taken  in  this  paper  which  does  not 
require an explicit use of the truth table. The design objective is the
minimization  of  the  average  number  of  errors,  rather  than  a  minimization 
of  the  number  of  logical  components  used.  The  nature  of  the  logical 
elements  is  quite  unconventional.  The  system  design  procedure  is  adaptive, 
and  is  based  upon  an  iterative  search  process.  Performance  feedback  is 
used  to  achieve  automatic  system  synthesis,  i.e.,  the  selection  of  the 
"best"  system  from  a  restricted  but  useful  class  of  possibilities.  The 
designer "trains" the system to give the correct responses by "showing"
it  examples  of  inputs  and  respective  desired  outputs.  The  more  examples 
"seen",  the  better  is  the  system  performance.  System  competence  will  be 
directly and quantitatively related to amount of experience.

II. A NEURON ELEMENT


In Fig. 1, a combinatorial logical circuit is shown which is a
typical element in the adaptive switching circuits to be considered.
This element bears some resemblance to a "neuron" model introduced by
von Neumann, whence the name.


FIG. 1.--AN ADJUSTABLE NEURON.

The binary input signals on the individual lines have values of +1
or -1, rather than the usual values of 1 or 0. Within the neuron, a
linear combination of the input signals is formed. The weights are the
gains $a_1, a_2, \ldots$, which could have both positive and negative values.
The output signal is +1 if this weighted sum is greater than a certain
threshold, and -1 otherwise. The threshold level is determined by the
setting of $a_0$, whose input is permanently connected to a +1 source.
Varying $a_0$ varies a constant added to the linear combination of input
signals.
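The behavior of this element can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example gain values are hypothetical, and the threshold is modeled, as the text suggests, by adding the level gain $a_0$ to the weighted sum and comparing the total with zero.

```python
from itertools import product

def neuron_output(gains, inputs):
    """Return +1 or -1 for one input pattern.

    gains[0] is the level gain a_0, driven by a constant +1 source;
    gains[1:] are the input gains a_1..a_n; inputs take values +1 or -1.
    """
    total = gains[0] + sum(a * s for a, s in zip(gains[1:], inputs))
    return 1 if total > 0 else -1

def truth_table(gains, n):
    """Map every one of the 2**n input combinations to the neuron output."""
    return {s: neuron_output(gains, s) for s in product((-1, 1), repeat=n)}

# Hypothetical gain settings for a two-input neuron that acts like "and".
and_gains = [-1.0, 1.0, 1.0]
table = truth_table(and_gains, 2)
```

Changing only `and_gains[0]` moves the threshold, which is the role the text assigns to the level gain.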

For fixed gain settings, each of the $2^5$ possible input combinations
would cause either a +1 or -1 output. Thus, all possible inputs are
classified into two categories. The input-output relationship is
determined by choice of the gains $a_0, \ldots, a_5$. In the adaptive neuron,
these gains are set during the "training" procedure.

In general, there are $2^{2^5}$ different input-output relationships or
truth functions by which the five input variables can be mapped into the
single output variable. Only a subset of these, the linearly separated
truth functions (Ref. 4), can be realized by all possible choices of the gains
of the neuron of Fig. 1. Although this subset is not all-inclusive*,
it is a useful subset, and it is "searchable", i.e., the "best" function
in many practical cases can be found iteratively without trying all
functions within the subset.

Application  of  this  neuron  in  adaptive  pattern  classifiers  was  first 
made  by  Mattson.  He  has  shown  that  complete  generality  in  choice  of 
switching  function  could  be  had  by  combining  these  neurons.  He  devised 
an  iterative  digital  computer  routine  for  finding  the  best  set  of  a's 
for the classification of noisy geometric patterns. An iterative procedure
having similar objectives has been devised by these authors and is
described in the next section. The latter procedure is quite simple to
implement,  and  can  be  analyzed  by  statistical  methods  that  have  already 
been  developed  for  the  analysis  of  adaptive  sampled  data  systems. 

III.  AN  ADAPTIVE  PATTERN  CLASSIFIER 

An  adaptive  pattern  classification  machine  (called  "Adaline",  for 
adaptive  linear)  has  been  constructed  for  the  purpose  of  illustrating 
adaptive  behavior  and  artificial  learning.  A  photograph  of  this  machine, 
which  is  about  the  size  of  a  lunch  pail,  is  shown  in  Fig.  2. 

During a training phase, crude geometric patterns are fed to the
machine by setting the toggle switches in the 4x4 input switch array.
Setting another toggle switch (the reference switch) tells the machine
whether the desired output for the particular input pattern is +1 or -1.
The system learns a little from each pattern and accordingly experiences
a design change. The machine's total experience is stored in the values
of the weights $a_0, \ldots, a_{16}$. The machine can be trained on undistorted
noise-free patterns by repeating them over and over until the iterative
search process converges, or it can be trained on a sequence of noisy
patterns on a one-pass basis such that the iterative process converges
statistically. Combinations of these methods can be accommodated
simultaneously. After training, the machine can be used to classify the
original patterns and noisy or distorted versions of these patterns.

* It becomes a vanishingly small fraction of all possible switching
functions as the number of inputs gets large.

A block schematic of Adaline is shown in Fig. 3. In the actual
machine, the quantizer is not built in as a device but is accomplished
by the operator in viewing the output meter. Different quantizers
(2-level, 3-level, 4-level) are realized by using the appropriate meter
scales (see Fig. 2). Adaline can be used to classify patterns into
several categories by using multi-level quantizers and by following
exactly the same adapting procedure.

FIG. 3.--SCHEMATIC OF ADALINE.

The following is a description of the iterative searching routine.
A pattern is fed to the machine, and the reference switch is set to
correspond to the desired output. The error (see Fig. 3) is then read
(by switching the reference switch; the error voltage appears on the
meter, rather than the neuron output voltage). All gains including the
level are to be changed by the same absolute magnitude, such that the
error is brought to zero. This is accomplished by changing each gain
(which could be positive or negative) in the direction which will diminish
the error by an amount which reduces the error magnitude by 1/17. The 17
gains may be changed in any sequence, and after all changes are made, the
error for the present input pattern is zero. Switching the reference back,
the meter reads exactly the desired output. The next pattern, and its
desired output, is presented and the error is read. The same adjustment
routine is followed and the error is brought to zero. If the first
pattern were reapplied at this point, the error would be small but not
necessarily zero. More patterns are inserted in like manner. Convergence
is indicated by small errors (before adaption), with small fluctuations
about a stable root-mean-square value. The iterative routine is purely
mechanical, and requires no thought on the part of the operator. Electronic
automation of this procedure will be discussed below.
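The routine above can be sketched in Python as follows. This is a minimal illustration, not a description of the hardware: the two 16-switch patterns are made up, the function names are hypothetical, and the meter reading is modeled as the desired output minus the weighted sum before quantization.

```python
def measured_error(gains, pattern, desired):
    """Error before quantization: desired output minus the weighted sum."""
    total = gains[0] + sum(a * s for a, s in zip(gains[1:], pattern))
    return desired - total

def adapt(gains, pattern, desired):
    """Change every gain by the same magnitude so that the error becomes zero."""
    err = measured_error(gains, pattern, desired)
    step = err / len(gains)        # each of the 17 gains removes 1/17 of the error
    signals = [1] + list(pattern)  # the level gain sees a constant +1 input
    return [a + step * s for a, s in zip(gains, signals)]

pattern_a = [1, -1] * 8            # hypothetical 4x4 switch settings
pattern_b = [1, 1, -1, -1] * 4
gains = [0.0] * 17                 # 16 pattern gains plus the level

gains = adapt(gains, pattern_a, +1)
err_a_after = measured_error(gains, pattern_a, +1)
gains = adapt(gains, pattern_b, -1)
err_b_after = measured_error(gains, pattern_b, -1)
err_a_again = measured_error(gains, pattern_a, +1)
```

After each call to `adapt` the error for the pattern just presented is zero, while re-reading the first pattern after the second adaptation gives a small residual error, in line with the description.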

The results of a typical adaption on six noiseless patterns are given
in Figs. 4 and 5. The patterns were selected in a random sequence, and
were classified into 3 categories. Each T was to be mapped to +60 on the
meter dial, each G to 0, and each F to -60. As a measure of performance,
after each adaptation, all six patterns were read in (without adaptation)
and six errors were read. The sum of their squares, denoted by
$\Sigma\epsilon^2$, was computed and plotted. Figure 5 shows the learning
curve for the case in which all gains were initially zero.


IV.  STATISTICAL  THEORY  OF  ADAPTION  FOR  SAMPLED-DATA  SYSTEMS 

This section is a summary of the portions of Widrow's statistical
theory of adaption for sampled-data systems that are useful in the
analysis of adaptive switching circuits.

Consider the general linear sampled-data system formed of a tapped
delay line, shown in Fig. 6. This system is intended to be a statistical
predictor. The present output sample g(m) is a linear combination of
present and past input samples, and is intended to approximate as closely
as possible the next input sample f(m + 1). The constants in this linear
combination are $h_1$, $h_2$, etc., the predictor impulse-response samples,
or the gains associated with the delay-line taps. Their choice constitutes
the adjustable part of the predictor design. They may be adjusted in the
following manner. Apply a mean square reading meter to $\epsilon(m)$, the
difference between the present input and the delayed prediction. This meter
will measure mean square error in prediction. Adjust $h_1, h_2, h_3, \ldots$,
until the meter reading is minimized.

FIG. 6.--AN ADJUSTABLE SAMPLED-DATA PREDICTOR.

The problem of adjusting the h's is not trivial, because their effects
upon performance interact. Suppose that the predictor has only two
impulses in its impulse response, $h_1$ and $h_2$. The mean square error for any
setting of $h_1$ and $h_2$ can be readily derived:

$$\epsilon(m) = f(m) - h_1 f(m - 1) - h_2 f(m - 2)$$

$$\overline{\epsilon^2(m)} = \phi_{ff}(0) h_1^2 + \phi_{ff}(0) h_2^2 - 2\phi_{ff}(1) h_1 - 2\phi_{ff}(2) h_2 + 2\phi_{ff}(1) h_1 h_2 + \phi_{ff}(0) \qquad (1)$$

The discrete autocorrelation function of the input is $\phi_{ff}(j)$.

The mean square error given by Eq. (1) is what the mean square
meter would read if it were to average over very large sample size. The
mean square error is a parabolic function of the predictor adjustments
$h_1$ and $h_2$, and, in general, can easily be shown to be a quadratic function
of such adjustments, regardless of how many there are.
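The quadratic form of Eq. (1) can be checked numerically. The sketch below is an illustration only: the input samples are made up, the function names are hypothetical, and the averages are taken circularly over a short record (an assumption made so that the autocorrelation estimates and the direct average cover the same samples).

```python
def autocorr(f, j):
    """Circular estimate of the discrete autocorrelation phi_ff(j)."""
    n = len(f)
    return sum(f[m] * f[(m - j) % n] for m in range(n)) / n

def mse_formula(f, h1, h2):
    """Mean square error of the two-impulse predictor, following Eq. (1)."""
    p0, p1, p2 = autocorr(f, 0), autocorr(f, 1), autocorr(f, 2)
    return (p0 * h1 ** 2 + p0 * h2 ** 2 - 2 * p1 * h1 - 2 * p2 * h2
            + 2 * p1 * h1 * h2 + p0)

def mse_direct(f, h1, h2):
    """Average of the squared prediction error, sample by sample."""
    n = len(f)
    errs = [f[m] - h1 * f[(m - 1) % n] - h2 * f[(m - 2) % n] for m in range(n)]
    return sum(e * e for e in errs) / n

f = [1.0, 0.5, -0.25, -1.0, -0.5, 0.75, 1.0, 0.0]  # made-up input samples
gap = abs(mse_formula(f, 0.6, -0.3) - mse_direct(f, 0.6, -0.3))
```

The two computations agree to rounding error for any choice of the two gains, which is what makes the surface a paraboloid in the adjustments.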

The optimum n-impulse predictor can be derived analytically by
setting the partial derivatives of $\overline{\epsilon^2}$ of Eq. (1) equal to
zero. This is the discrete analogue of Wiener's optimization of continuous
filters.

Finding the optimum system experimentally is the same as finding a
minimum of a paraboloid in n dimensions. This could be done manually by having
a human operator read the meter and set the adjustment, or it could be done
automatically by making use of any one of several iterative gradient
methods for surface-searching, as devised by numerical analysts. When either
of these schemes is employed, an adaptive system results that consists
essentially of a "worker" and a "boss". The worker in this case predicts,
whereas the boss has the job of adjusting the worker.

Figure 7 is a block-diagram representation of such a basic adaptive
unit. The boss continually seeks a better worker by trial and error
experimentation with the structure of the worker. Adaption is a
multidimensional performance feedback process. The "error" signal in the
feedback control sense is the gradient of mean square error with respect
to adjustment.

Many  of  the  commonly  used  gradient  methods  search  surfaces  for 
stationary  points  by  making  changes  in  the  independent  variables  (starting 
with  an  initial  guess)  in  proportion  to  measured  partial  derivatives  to 
obtain  the  next  guess,  and  so  forth.  These  methods  give  rise  to  geometric 
(exponential) decays in the independent variables as they approach a
stationary point for second-degree or quadratic surfaces. One-dimensional
surface-searching  is  illustrated  in  Fig.  8. 

The surface being explored in Fig. 8 is given by Eq. (2). The first
and second derivatives are given by Eqs. (3) and (4).

$$y = a(x - b)^2 + c \qquad (2)$$

$$\frac{dy}{dx} = 2a(x - b) \qquad (3)$$

$$\frac{d^2 y}{dx^2} = 2a \qquad (4)$$

A sampled-data feedback model of the iterative process is shown in
Fig. 8b. Each time a guess in x is to be made, the derivative is measured
physically, whereas in the model it is formed as a quantity proportional
to x (according to Eq. 3).

FIG. 7.--AN ADAPTIVE PREDICTOR.

FIG. 8.--ONE-DIMENSIONAL SURFACE SEARCHING.

The numerical sequence at the point x(i) begins with the initial
guess and proceeds as a sampled transient that relaxes geometrically
toward the stationary point, exactly like the sequence of guesses in the
surface exploration.

The flow-graph can be reduced, and the transfer function from any
point to any other point can thus be found. The resulting characteristic
equation is

$$-(2ak + 1) z^{-1} + 1 = 0$$

The iterative process is stable when

$$0 > k > -\frac{1}{a} \qquad (5)$$

In order to choose the "loop gain" k to get a specific transient decay
rate, one would have to measure the second derivative (2a) at some point
on the curve. Transients decay completely in one step when

$$k = -\frac{1}{2a}$$
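The one-dimensional search can be simulated directly. The sketch below assumes the update rule x(i+1) = x(i) + k dy/dx, which is one reading consistent with a characteristic root of (2ak + 1); the function name and the numerical values of a, b, and k are illustrative only.

```python
def search(a, b, k, x0, steps):
    """Iterate x <- x + k * dy/dx on the parabola y = a*(x - b)**2 + c."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + k * 2 * a * (x - b))   # dy/dx = 2a(x - b), Eq. (3)
    return xs

a, b = 2.0, 3.0
slow = search(a, b, k=-0.05, x0=0.0, steps=50)        # inside 0 > k > -1/a
one_step = search(a, b, k=-1 / (2 * a), x0=0.0, steps=1)
unstable = search(a, b, k=-0.6, x0=0.0, steps=20)     # beyond -1/a = -0.5
```

With the small loop gain the guesses relax geometrically toward b, each step shrinking the distance by the factor (1 + 2ak) = 0.8; with k = -1/(2a) the stationary point is reached in a single step; with k outside the stable range the guesses diverge.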


Derivatives are measured in the actual adaptive system by varying
the h's by fixed increments and subtracting measured values of mean square
error based on a sample size of N samples. "Noise" in the measurements of
the mean square error surface due to small sample size causes noisy
derivative measurements. These noises enter the adaption process, as
indicated in Fig. 8c, and cause noisy system adjustments. The larger the
sample size taken per derivative measurement, the less is the noise. The
slower the adaptation, the more precise it is. The faster the adaptation,
the more noisy (and poor) are the adjustments.

Consider that the adaptive model has only a single adjustment. A
plot of mean square error versus $h_1$ for this simplest system would be a
parabola, analogous to the parabola of Fig. 8. Noise in the system
adjustment causes loss in steady-state performance. It is useful to
define a dimensionless parameter M, the "misadjustment", as the ratio of
the mean increase in mean square error to the minimum mean square error.
It is a measure of how the system performs on the average, after adapting
transients have died out, compared with the fixed optimum system. With
regard to the curve of Fig. 8,

$$M = \frac{a\,\overline{(x - b)^2}}{c} \qquad (6)$$

Variance in x about the optimum value causes the average of y to be
greater than the minimum value c. The increase in y equals the variance
in x multiplied by a, as can be seen from Eq. (2).

More  detailed  derivations  of  misadjustment  formulas  covering  several 
different  methods  of  surface  searching  and  derivative  measurement  are 
presented  in  Refs.  7  and  8.  The  particular  formulas  which  can  be  applied 
to  the  analysis  of  adaptive  switching  circuits  are  the  following. 

When derivatives are measured by data repeating, i.e., when the same
system input data is applied for both the N "forward" and the N "backward"
measurements of mean square error, the misadjustment is given by

$$M = \frac{1}{2N\tau} \qquad (7)$$


$\tau$ is the time constant of the iterative process of Fig. 8, and is
equal to $-(1/2ak)$. A unit time constant means that the adjustment error
decreases by a factor 1/e per iteration cycle. Equation (7) is
conservative, and appreciably so only for small values of $\tau$, less than 1.
In the limiting case of one-step adaption, $\tau = 0$ and the appropriate
misadjustment formula is

$$M = \frac{1}{N} \qquad (8)$$


In  deriving  Formulas  (7)  and  (8),  it  has  been  assumed  that  the  error 
samples  are  gaussian  distributed,  with  zero  mean,  and  are  uncorrelated. 

It can be shown that these results are highly insensitive to this
distribution density shape, and are appreciably affected by correlation only
when it exceeds 0.8.

It  is  interesting  to  note  that  the  quality  of  adaption  depends  only 
on  the  number  of  samples  "seen"  by  the  system  in  adapting.  When  Eq.  (7) 
applies, the $(N\tau)$ product determines the misadjustment. This product is
equal to the number of samples seen per time constant of adaptation. If
it may be considered that transients die out within two time constants,
then the misadjustment equals the reciprocal of the number of samples that
elapse in adapting to a step change in process. This statement is
obviously the case when Eq. (8) applies.




The expressions (7) and (8) are based on the supposition that fresh
data is brought in for each cycle of iteration. If the system adapts on
a fixed body of N error samples, either by adapting with the one-step
procedure and stopping, or by repeating the same data from iteration cycle
to iteration cycle for several time constants and then stopping, the
misadjustment is given by Formula (8).
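The fixed-body case can be illustrated with a small Monte Carlo experiment. The sketch below is not the derivation of Refs. 7 and 8: it assumes a single adjustable gain, gaussian inputs and errors, and a one-step fit that lands exactly on the minimum of the N-sample error parabola; the optimum gain, noise level, and function name are all hypothetical.

```python
import random

def misadjustment_estimate(n_samples, trials, noise_std=1.0, seed=0):
    """Monte Carlo estimate of M for one adjustment fitted to N samples."""
    rng = random.Random(seed)
    h_opt = 0.7                    # hypothetical optimum gain
    min_mse = noise_std ** 2       # error of the best-adapted system
    excess = 0.0
    for _ in range(trials):
        sxx = sxd = 0.0
        for _ in range(n_samples):
            x = rng.gauss(0.0, 1.0)
            d = h_opt * x + rng.gauss(0.0, noise_std)
            sxx += x * x
            sxd += x * d
        h_fit = sxd / sxx                 # minimum of the N-sample parabola
        excess += (h_fit - h_opt) ** 2    # increase in mean square error, unit input power
    return (excess / trials) / min_mse

m_est = misadjustment_estimate(n_samples=50, trials=4000)
```

Under these assumptions the estimate comes out close to the reciprocal of the number of samples (about 0.02 for N = 50), which is the behavior the text attributes to adaption on a fixed body of data.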

When there are m interacting adjustments instead of just one,
Expressions (7) and (8) may be generalized by multiplication by m.
Multidimensional one-step surface searching may be accomplished by Newton's
method. Multi-step searching may be conveniently achieved by means of
the method of steepest descent (making changes in adjustment in the
direction of the surface gradient and in proportion to its magnitude) or
by the Southwell relaxation method (cyclic adjustment for minima, one
coordinate at a time).


V. STATISTICAL THEORY OF ADAPTION FOR THE ADAPTIVE NEURON ELEMENT


The error signal measured and used in adaption of the neuron of
Fig. 1 is the difference between the desired output and the sum before
quantization. This error is indicated by $\epsilon$ in Fig. 9. The actual
neuron error, indicated by $\epsilon_n$ in Fig. 9, is the difference between
the neuron output and the desired output.

The objective of adaption is the following. Given a collection of
input patterns and the associated desired outputs, find the best set of
weights $a_0, a_1, \ldots, a_n$ to minimize the mean square of the neuron
error, $\overline{\epsilon_n^2}$. Individual neuron errors could only have the
values of +2, 0, and -2 with a two-level quantizer. Minimization of
$\overline{\epsilon_n^2}$ is therefore equivalent to minimizing the average
number of neuron errors.

The simple adaption procedure described in this paper minimizes
$\overline{\epsilon^2}$ rather than $\overline{\epsilon_n^2}$. The measured
error $\epsilon$ has zero mean (a consequence of the minimization of
$\overline{\epsilon^2}$) and will be assumed to be gaussian-distributed. By
making use of certain geometric arguments or by using a statistical theory
of amplitude quantization (Ref. 10), it can be shown that
$\overline{\epsilon_n^2}$ is a monotonic function of $\overline{\epsilon^2}$,
and that minimization of $\overline{\epsilon^2}$ is equivalent to minimization
of $\overline{\epsilon_n^2}$ and to minimization of the probability of neuron
error.* The ratio of these mean squares has been calculated and is plotted in
Fig. 9 as a function of the neuron error probability (which is
$\overline{\epsilon_n^2}/4$).

[Fig. 9 axis label: % probability of neuron error.]

Given any collection of input patterns and the associated desired
outputs, the measured mean square error must be a precisely parabolic
function of the gain settings, $a_0, \ldots, a_n$. Let the kth pattern be
indicated as the vector $S(k) = s_1(k), s_2(k), \ldots, s_n(k)$. The s's have
values of +1 or -1, and represent the n input components numbered in a fixed
manner. The kth error is

$$\epsilon(k) = d(k) - a_0 - a_1 s_1(k) - a_2 s_2(k) - \cdots - a_n s_n(k) \qquad (9)$$

For simplicity, let the neuron have only two input lines and a level
control. The square of the error is accordingly

$$\epsilon^2(k) = d^2(k) + a_0^2 + s_1^2(k) a_1^2 + s_2^2(k) a_2^2 - 2d(k) a_0 - 2d(k) s_1(k) a_1 - 2d(k) s_2(k) a_2 + 2 s_1(k) a_0 a_1 + 2 s_2(k) a_0 a_2 + 2 s_1(k) s_2(k) a_1 a_2 \qquad (10)$$

The mean square error averaged over k is

$$\overline{\epsilon^2} = a_0^2 + \phi(s_1, s_1) a_1^2 + \phi(s_2, s_2) a_2^2 - 2\bar{d} a_0 - 2\phi(d, s_1) a_1 - 2\phi(d, s_2) a_2 + 2\bar{s}_1 a_0 a_1 + 2\bar{s}_2 a_0 a_2 + 2\phi(s_1, s_2) a_1 a_2 + \phi(d, d) \qquad (11)$$

The $\phi$'s are spatial correlations: $\phi(s_1, s_2) = \overline{s_1 s_2}$, etc. Note that

$$\phi(s_j, s_j) = \overline{s_j s_j} = 1$$
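The averaged quadratic form can be verified against a direct average of squared errors. In the sketch below the training set is made up and the function names are hypothetical; every cross term carries the factor 2 that the expansion of Eq. (10) produces.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def mse_direct(patterns, desired, a0, a1, a2):
    """Average of the squared measured error over all patterns."""
    return mean((d - a0 - a1 * s1 - a2 * s2) ** 2
                for (s1, s2), d in zip(patterns, desired))

def mse_from_correlations(patterns, desired, a0, a1, a2):
    """Mean square error assembled from the spatial correlations."""
    s1 = [p[0] for p in patterns]
    s2 = [p[1] for p in patterns]
    phi = lambda u, v: mean(x * y for x, y in zip(u, v))  # spatial correlation
    return (a0 ** 2 + phi(s1, s1) * a1 ** 2 + phi(s2, s2) * a2 ** 2
            - 2 * mean(desired) * a0
            - 2 * phi(desired, s1) * a1 - 2 * phi(desired, s2) * a2
            + 2 * mean(s1) * a0 * a1 + 2 * mean(s2) * a0 * a2
            + 2 * phi(s1, s2) * a1 * a2 + phi(desired, desired))

patterns = [(1, 1), (1, -1), (-1, 1), (1, 1), (-1, -1)]  # made-up training set
desired = [1, -1, -1, 1, -1]
gap = abs(mse_direct(patterns, desired, 0.2, -0.4, 0.9)
          - mse_from_correlations(patterns, desired, 0.2, -0.4, 0.9))
```

Because the inputs are +1 or -1, the coefficients of the squared gains are all 1, so every second partial derivative of the surface equals 2 regardless of the patterns.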

Adjusting  the  a's  to  minimize  tr  is  equivalent  to  searching  a  para¬ 
bolic  stochastic  surface  (having  as  many  dimensions  as  there  are  a's)  for 
a  minimum.  How  well  this  surface  can  be  searched  will  be  limited  by 
sample  size,  i.e.,  by  the  number  of  patterns  seen  in  the  searching  process. 
— 

The  probability  of  neuron  error  is  minimized  by  the  adaption  pro¬ 
cedure  subject  to  the  restriction  th&r.  7  =  0.  This  does  not  preclude 
the  possibility  that  the  error  probability  could  bo  even  less  with 
neuron  adjustments  that  will  not  cause  7  to  be  zero. 

The method of searching that has proven most useful is the method of steepest descent. Vector adjustment changes are made in the direction of the gradient (downhill, toward lower mean square error). The change made in $a_0$ is proportional to the partial derivative of $\overline{\epsilon^2}$ with respect to $a_0$, etc. The partial derivatives are determined at one point, then the changes in all adjustments are made simultaneously. This completes one iterative cycle. The process is then repeated. Transients decay along each adjusting coordinate in relaxation toward the stationary point. They consist of sums of geometric sequence components (there are as many natural "frequencies" as the number of adjustments, as can be seen from generalization of the flow graph of Fig. 0; see Ref. 9). If the proportionality constant k between partial derivative and size of change is made sufficiently small, transients will be stable. Just how big this constant could be for stable searching depends upon the surface characteristics (i.e., upon pattern characteristics). It can be shown, however, that when all second partial derivatives are equal (differentiation of Eq. 11 shows them all to have the value 2), the method of steepest descent will be stable when the proportionality constant k is less than the reciprocal of the second partial derivative. It can also be shown that when k is small, transients can be well represented as being of a single time constant. This time constant is somewhat sensitive to the specific pattern information, but generally turns out to equal $1/(2k)$.
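
The relaxation described above can be illustrated with a short Python sketch. It assumes an idealized surface whose second partial derivatives all equal 2 and whose cross terms vanish (as for uncorrelated, zero-mean inputs), so the gradient is simply twice the distance from the best gains. The starting point, the target gains, and the value of k are arbitrary choices for illustration.

```python
import math


def steepest_descent(a_start, a_best, k, cycles):
    """Iterate a <- a - k * grad, with grad = 2 * (a - a_best).

    Assumption: idealized parabolic surface with all second partial
    derivatives equal to 2 and no cross terms.
    """
    a = list(a_start)
    history = [list(a)]
    for _ in range(cycles):
        grad = [2.0 * (ai - bi) for ai, bi in zip(a, a_best)]
        # All adjustments are changed simultaneously in one iterative cycle.
        a = [ai - k * gi for ai, gi in zip(a, grad)]
        history.append(list(a))
    return history


k = 0.01
a_best = [1.0, -0.5, 0.25]
history = steepest_descent([0.0, 0.0, 0.0], a_best, k, cycles=50)

tau = 1.0 / (2.0 * k)  # predicted time constant, here 50 cycles
# Fraction of the initial distance remaining along the first coordinate.
remaining = (history[50][0] - a_best[0]) / (history[0][0] - a_best[0])
```

In this idealization each cycle multiplies the distance from the stationary point by $(1 - 2k)$, so after $1/(2k)$ cycles the remaining fraction is close to $e^{-1}$, which is what a single time constant of $1/(2k)$ means.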

When partial derivatives are measured by averaging over only a few patterns each iteration cycle, the measurements will be noisy, and transients will be noisy exponentials. Stability and time constant will remain dependent on k and the properties of the large-sample-size mean-square-error surface.

The method of adaption that has been used requires an extremely small sample size per iteration cycle, namely one pattern. One-pattern-at-a-time adaption has the advantages that derivatives are extremely easy to measure and that no storage is required within the adaptive machinery except for the gain values (which contain the past experience of the neuron).

The square of the error for a single pattern (the mean square error for a sample size of one) is given by Eq. (10). The partial derivatives are

$$\begin{aligned}
\frac{\partial \epsilon^2(k)}{\partial a_0} &= -2d(k) + 2a_0 + 2s_1(k)\,a_1 + 2s_2(k)\,a_2 \\
\frac{\partial \epsilon^2(k)}{\partial a_1} &= s_1(k)\,[-2d(k) + 2a_0 + 2s_1(k)\,a_1 + 2s_2(k)\,a_2] \\
\frac{\partial \epsilon^2(k)}{\partial a_2} &= s_2(k)\,[-2d(k) + 2a_0 + 2s_1(k)\,a_1 + 2s_2(k)\,a_2]
\end{aligned} \tag{12}$$

Comparison of Eqs. (12) with Eq. (9) shows that the derivatives are simply related to the measured error, and suggests that the derivatives could be measured without squaring and averaging and without actual differentiation. The jth partial derivative is given by

$$\frac{\partial \epsilon^2(k)}{\partial a_j} = -2\,s_j(k)\,\epsilon(k) \tag{13}$$

where $s_0(k) = 1$ is understood for the level control.

It follows that all derivatives have the same magnitude, and have signs determined by the error sign and the respective input signal signs. Application of the method of steepest descent requires that all gain changes in a given iteration cycle have the same magnitude and the appropriate sign. Each gain change reduces the error magnitude by the same amount. The procedure described in Sec. C for bringing $\epsilon(k)$ to zero with each successive input pattern gives the constant k a value of $1/[2(n+1)]$. From the previous discussion we see that the time constant of the iterative process is therefore $\tau = (n + 1)$ patterns. On the 4x4 Adaline, there are n = 16 input line gains plus a level control. Therefore, the time constant should be roughly 17 patterns (for verification, see the learning curve of Fig. 5). The search procedure could be readily modified to speed up or slow down the adaption process. For example, bringing the error $\epsilon(k)$ to half its value rather than to zero with each input pattern halves k and doubles $\tau$.
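
A minimal Python sketch of one-pattern-at-a-time adaption follows. It applies the same-magnitude gain change implied by Eq. (13) and scales it so that a chosen fraction of the error on the current pattern is removed. The function names and the `fraction` parameter are illustrative assumptions, not notation from the report.

```python
def output_error(gains, pattern, desired):
    """eps(k) = d(k) - a0 - sum of a_i * s_i(k); gains[0] is the level control."""
    inputs = [1] + list(pattern)  # s0 = 1 feeds the level control
    return desired - sum(a * s for a, s in zip(gains, inputs))


def adapt_once(gains, pattern, desired, fraction=1.0):
    """One adaption cycle on a single pattern.

    fraction=1.0 brings the error on this pattern to zero, which
    corresponds to k = 1/(2(n+1)); fraction=0.5 halves k.
    Returns the new gains and the error measured before the change.
    """
    inputs = [1] + list(pattern)
    error = output_error(gains, pattern, desired)
    # Every gain changes by the same magnitude; the sign follows
    # the product of the error sign and the input sign.
    step = fraction * error / len(inputs)
    new_gains = [a + step * s for a, s in zip(gains, inputs)]
    return new_gains, error


# Hypothetical 4x4 pattern (n = 16) with desired output +1.
pattern = [1, -1] * 8
gains, error_before = adapt_once([0.0] * 17, pattern, 1)
error_after = output_error(gains, pattern, 1)  # essentially zero
```

Since each of the n + 1 gain changes moves the sum by the same amount, dividing the error by n + 1 is what removes it exactly in one cycle.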

The statistical theory of adaption for sampled-data systems is based on search of multidimensional stochastic parabolic surfaces for stationary points. The misadjustment, a dimensionless measure of how well a system will adapt, is defined as the ratio of the mean increase in mean square error (due to searching the surface with small-sample-size data) to the minimum mean square error (a performance reference that could only be achieved with perfect knowledge of input process statistics). The misadjustment Formulas (7) and (8) apply directly to the adaptive neuron.

The misadjustment formulas give the per unit increase in measured mean square error as a result of adapting on a finite number of patterns. Since the ratio of probability of neuron error to the mean square error $\overline{\epsilon^2}$ is essentially constant over a wide range of error probabilities (Fig. 9), the misadjustment as expressed by Formulas (7) and (8) may be interpreted in terms of the ratio of the increase in error probability to the minimum error probability.

If adaption is accomplished by injection of a fresh pattern each iteration cycle, the mean values of the gains will converge, after adapting transients have died out, on the best set of values for large sample size. The actual gain settings will experience random excursions about these values, and the resulting misadjustment, as derived from Eq. (7), is

$$M = \frac{n + 1}{2\tau} \tag{14}$$

Following the procedure of bringing $\epsilon(k)$ to zero each iteration cycle, the misadjustment is

$$M = \frac{n + 1}{2\tau} = \frac{n + 1}{2(n + 1)} = \frac{1}{2} \tag{15}$$

If adaption is accomplished by taking a fixed collection of N patterns and repeating them over and over for several time constants (where the time constant is long, several times N), the gains will stabilize on the best set of values for the N patterns. In general, these gains will not be the best for the large collection of patterns that the N patterns were abstracted from. Making use of Eq. (8), the misadjustment is

$$M = \frac{n + 1}{N} \tag{16}$$
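
The misadjustment formulas are simple enough to tabulate. The sketch below reads Eq. (14) as $M = (n+1)/(2\tau)$ and Eq. (16) as $M = (n+1)/N$, the form used in the worked experiment later in this section. The function names are hypothetical.

```python
def misadjustment_fresh_patterns(n, tau):
    """Eq. (14): M = (n + 1) / (2 * tau), a fresh pattern every cycle."""
    return (n + 1) / (2.0 * tau)


def misadjustment_fixed_set(n, num_patterns):
    """Eq. (16): M = (n + 1) / N, adapting repeatedly on N stored patterns."""
    return (n + 1) / float(num_patterns)


def patterns_needed(n, target_misadjustment):
    """Eq. (16) solved for N at a chosen misadjustment."""
    return (n + 1) / target_misadjustment


# 4x4 Adaline with the error brought to zero each cycle (tau = n + 1 = 17).
m_4x4 = misadjustment_fresh_patterns(16, 17)   # 0.5, as in Eq. (15)
# 3x3 patterns, ten stored patterns.
m_3x3 = misadjustment_fixed_set(9, 10)         # 1.0, i.e. 100 per cent
# Patterns needed for a 20 per cent misadjustment with n = 9.
n_for_20 = patterns_needed(9, 0.2)             # about 50
```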


An extensive series of simulation studies has been made to test the validity of the misadjustment Formulas (14) and (15). These tests have shown that the formulas are highly accurate over a very wide range of pattern and noise characteristics. A description of a typical experiment and its results is given in Fig. 10.

FIG. 10. -- EXPERIMENTAL ADAPTION ON 10 NOISY 3x3 PATTERNS. (Figure annotation: best neuron makes 12 errors out of 100.)

Noisy 3x3 patterns were generated by randomly injecting errors in ten percent of the positions of the four "pure" patterns, X, T, C, J. These patterns, shown in Fig. 10, are ordered for convenience in checking. They were fed manually to Adaline and chosen randomly by looking up their identification numbers in a random number table. The X's (numbered from left to right, top to bottom) were numbered 1 to 25, the T's were 25 to 50, the C's were 50 to 75, and the J's were 75 to 100.

The best system was arrived at by slow precise adaption on the full body of 100 noisy patterns, repeating them over and over several times. This system was able to classify the patterns as desired, except for twelve errors out of the 100 total. The gains were then set to zero and ten patterns were chosen at random. The best system for the ten selected patterns was arrived at by slow adaption on these patterns, repeating them over and over several times. The resulting system was then tested on the full body of 100 patterns, and 25 classification errors out of 100 were made. This number of errors was more than twice that made by the best system adapted on 100 patterns. The misadjustment was 108 per cent, i.e., (25 - 12)/12. This small-sample-size adaptation experiment was repeated three more times, and the misadjustments that resulted, in order, were 58 per cent, 67 per cent and 133 per cent. Since N = 10 patterns and n = 9 input lines, the theoretical misadjustment was

$$M = \frac{n + 1}{N} = 100 \text{ per cent}$$

An average taken over the four experiments gives a measured misadjustment of 91.5 per cent.
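
The arithmetic of this experiment can be retraced in a few lines of Python. The sketch treats the error counts as stand-ins for mean square error, which the text justifies by the roughly constant ratio between the two. Only the first trial's error count (25) is stated in the report; the other three trials are entered as the reported percentages.

```python
def measured_misadjustment(errors_small_sample, errors_best):
    """Increase in errors relative to the best-adapted system."""
    return (errors_small_sample - errors_best) / errors_best


# First trial: 25 errors against 12 for the best system.
first_trial = measured_misadjustment(25, 12)      # 13/12

# Reported misadjustments for the four trials, in per cent.
reported_percent = [108, 58, 67, 133]
average_percent = sum(reported_percent) / len(reported_percent)

# Theory for N = 10 stored patterns and n = 9 input lines.
theoretical_percent = 100.0 * (9 + 1) / 10
```

The average of the four reported values is 91.5 per cent against a theoretical 100 per cent, which is the comparison the text draws.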

The adaptive classifier can adapt after seeing remarkably few patterns. A misadjustment of 20 per cent should be acceptable in most applications. To achieve this, all one has to do is supply the adaptive classifier with a number of patterns equal to five times the number of input lines, regardless of how noisy the patterns are and how difficult the "pure" patterns are to separate. Although the misadjustment formulas have been derived for the specific classifier consisting of a single adaptive neuron, it is suspected that the following "rule of thumb" will apply fairly well to all adaptive classifiers: the number of patterns required to train an adaptive classifier is equal to several times the number of bits per pattern.

Both sets of patterns were fed to two Adalines simultaneously and perfect adaption was possible. The adaption procedure was the following: if the desired output for a given input pattern applied to both machines was -1, then both machines were adapted in the usual manner to ensure this; if the desired output was +1, the machine with the smallest measured error $\epsilon$ was assigned to adapt to give a +1 output while the other machine remained unchanged. If either or both machines gave outputs of +1, the pattern was classified as +1. If both machines gave -1 outputs, the pattern was classified as -1.

This  procedure  assigns  specific  "responsibility"  to  the  neuron  that 
can  most  easily  assume  it.  If,  at  the  beginning  of  adaption,  a  given 
neuron  takes  responsibility  for  producing  a  +1  with  a  certain  input  pattern, 
it  will  invariably  take  this  responsibility  each  time  the  pattern  is  applied 
during  training.  Notice  that  it  is  not  necessary  for  a  teacher  to  assign 
responsibility.  The  combination  does  this  automatically  and  requires  only 
input  patterns  and  the  associated  desired  outputs,  like  the  single  neuron. 
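
The two-machine procedure can be sketched in Python as follows. The sketch assumes that "the usual manner" means bringing the error on the current pattern to zero, that "smallest measured error" means the smallest error magnitude relative to a +1 target, and that a machine's output is the sign of its analog sum. The four-pattern training set (not linearly separable by one neuron) is a made-up illustration, not the pattern sets used in the report.

```python
def net(gains, pattern):
    """Analog sum a0 + sum of a_i * s_i for one machine."""
    return gains[0] + sum(a * s for a, s in zip(gains[1:], pattern))


def adapt(gains, pattern, desired):
    """Bring this machine's error on the pattern to zero."""
    inputs = [1] + list(pattern)
    step = (desired - net(gains, pattern)) / len(inputs)
    return [a + step * s for a, s in zip(gains, inputs)]


def train_step(machines, pattern, desired):
    """Adapt every machine for a -1 target, only one machine for a +1 target."""
    if desired == -1:
        return [adapt(g, pattern, -1) for g in machines]
    # Job assigner: the machine closest to producing +1 takes responsibility.
    errors = [abs(1 - net(g, pattern)) for g in machines]
    chosen = errors.index(min(errors))
    return [adapt(g, pattern, 1) if i == chosen else g
            for i, g in enumerate(machines)]


def classify(machines, pattern):
    """OR element: +1 if either machine gives +1, otherwise -1."""
    return 1 if any(net(g, pattern) > 0 for g in machines) else -1


# Hypothetical training set: +1 when the two inputs differ.
training_set = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]
machines = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
for _ in range(3):  # three passes through the four patterns
    for pattern, desired in training_set:
        machines = train_step(machines, pattern, desired)

results = [classify(machines, p) for p, _ in training_set]
```

In this small case each machine ends up responsible for one of the two +1 patterns, and `results` matches the desired outputs, although no teacher ever told either machine which pattern to take.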

More complicated problems can be well solved by combinations of many neurons. Their inputs are connected in parallel while their outputs are connected to an OR element. The only new requirement is that of the job assigner, which is simple to implement. Such combinations greatly increase the generality of the classification scheme, and the ease of adaption is comparable to that of a single neuron. A theory of adaption for these combinations has yet to be completed. Preliminary considerations indicate that the misadjustment formulas will apply without appreciable change when combinations of neurons adapt on noisy nonlinearly separable patterns.

Various  classification  problems  could  be  solved  simultaneously  by 
multiplexing  neurons  or  combinations  of  neurons.  One  neuron  might  be 
trained  to  decide  whether  the  man  in  a  given  picture  does  or  does  not 
have  a  green  tie,  while  another  neuron  or  combination  could  be  trained  to
decide  whether  or  not  the  man  has  a  checkered  shirt.  Each  neuron  or 
combination  has  its  own  output  line,  and  each  is  fed  the  appropriate 
desired  output  signal  during  training.  The  input  signals  are  common  to 
all  neurons.  In  this  manner,  it  is  possible  to  form  adaptive  classifiers 
that  can  separate  with  great  accuracy  large  quantities  of  complicated 
patterns  into  many  output  categories.  All  that  is  needed  is  large  quantities 
of  adaptive  neurons. 

VII. ADAPTIVE MICROELECTRONIC SYSTEMS

The structure of the neuron described in this report and its adaption procedure are sufficiently simple that an effort is under way to develop a physical device which is an all-electronic fully automatic Adaline. The objective is a self-contained device, like the one sketched in Fig. 11, that has a signal input line, a "desired output" input line (actuated during training only), an output line, and a power supply. The device itself should be suitable for mass production, should contain few parts, should be reliable, and probably should consist of solid-state components.


FIG. 11. -- ELECTRONIC AUTOMATICALLY-ADAPTED NEURON.

To  have  such  an  adaptive  neuron,  it  is  necessary  to  be  able  to  store 
the  gain  values,  which  could  be  positive  or  negative,  in  such  manner  that 
these  values  could  be  changed  electronically. 

Present efforts have been based on the use of multi-aperture magnetic cores (MAD elements). The special characteristics of these cores permit multilevel storage with continuous, non-destructive read out. In addition, the stored levels are easily changed by small controlled amounts, with the direction of the change being determined by logic performed by the MAD element.

FIG. 13. -- MAD ELEMENT WINDINGS FOR USE IN ELECTRONIC ADAPTIVE ELEMENTS.

Figure 12 shows a block diagram for an electronic adaptive element, which realizes the adaption technique described previously. The MAD element array contains magnetic cores and wire only; Fig. 13 shows how each core is wired. A photograph of the first experimental array for a 3x3 Adaline is shown in Fig. 14. This array performs the following functions:

1. Storage of the gains and the d-c level (the $a_i$'s). This storage is passive, i.e., the information is not lost in the event of power failure. There is one MAD element for each gain and one for the d-c level. Thus for m x n patterns mn + 1 MAD elements are required.

2. Continuous computation of the sum $a_0 + \sum_i a_i s_i(k)$ for the pattern connected to the input. The sum appears as two a-c signals, one appearing across each of the read-out wires. The signal across one of these wires corresponds to the sum of those terms for which the $s_i(k)$ is negative; the other corresponds to the sum of the d-c level and those terms for which the $s_i(k)$ is positive. [Each read-out wire carries an a-c current. The voltage drop per core on a read-out wire is a linear function of the value of the gain stored in that core, provided that the aperture through which the wire passes is not blocked by energizing the block winding of that aperture. If blocked, the voltage drop is very small. Thus, the summation is accomplished by energizing the "Block (-)" winding of the ith core when $s_i(k)$ is negative and energizing the "Block (+)" winding when $s_i(k)$ is positive.]

3. Computation of the adaption change $\delta a_i$ in the gain $a_i$. Each of these changes is proportional to the product $s_i(k)\,\mathrm{sgn}[\epsilon(k)]$. [The change in the stored level of the core is accomplished by applying the proper signal to the "Adapt Drive" wire. With the proper adapt-drive waveform, the direction of the change may be reversed by applying a d-c bias to one of the "Input" windings. Input 1 is energized when both $s_i(k)$ and $\epsilon(k)$ are positive; Input 2 is energized when both are negative. For $s_i(k)$ and $\epsilon(k)$ of opposite sign, no current is applied to either input.] All of the changes $\delta a_i$ are of the same magnitude. To reduce the error to approximately zero, the Adapt Drive wire is energized a sufficient number of times. The d-c winding on the MAD element carries a d-c bias current. This current may be removed between adapt-drive signals, but must be applied during the adapt-drive signal.

The peripheral circuitry supplies the necessary signals to the MAD element array, and converts the a-c read-out signals to a more useful d-c output. The read-out oscillator provides the a-c current to operate the read-out wires. The read-out amplifier converts the a-c output signals to a d-c output signal. The summer ($\Sigma$) computes the error $\epsilon(k)$. The error sign circuit computes sgn $\epsilon(k)$. The error magnitude circuit provides a signal which blocks the "and" gate when the magnitude of the error falls below a preset level, thus preventing the operation of the adapt-drive circuit. Therefore, when the "Adapt" signal is applied, the adapt-drive circuit is repeatedly energized until the error falls below the preset level. The delay circuit controls the amount of time between energizations of the adapt-drive circuit. This time must be long enough to allow the error to reach its new value after each energization.
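
The control loop formed by the error sign circuit, the error magnitude circuit, and the adapt-drive circuit can be mimicked in software. The Python sketch below is only a behavioral analogy: each pass through the loop stands for one energization of the Adapt Drive wire, and the pulse size `delta`, the preset level `threshold`, and the safety limit `max_pulses` are assumed values, not figures from the report.

```python
def sgn(x):
    """Error sign circuit."""
    return 1 if x >= 0 else -1


def adapt_drive(gains, pattern, desired, delta=0.01, threshold=0.15,
                max_pulses=1000):
    """Pulse all gains by a fixed magnitude until |error| < threshold.

    gains[0] is the d-c level; each pulse changes gain i by
    delta * s_i * sgn(error), so every change has the same magnitude.
    Returns the final gains and the number of pulses applied.
    """
    inputs = [1] + list(pattern)
    gains = list(gains)
    pulses = 0
    while pulses < max_pulses:
        error = desired - sum(a * s for a, s in zip(gains, inputs))
        if abs(error) < threshold:
            break  # error magnitude circuit blocks the "and" gate
        direction = sgn(error)
        gains = [a + delta * s * direction for a, s in zip(gains, inputs)]
        pulses += 1
    return gains, pulses


# Hypothetical 3x3 pattern (n = 9), starting from zero gains.
pattern = [1, -1, 1, -1, 1, -1, 1, -1, 1]
final_gains, pulse_count = adapt_drive([0.0] * 10, pattern, 1)
```

With ten adjustable levels each pulse moves the sum by ten times `delta`, so the preset level must exceed that amount if the loop is to settle rather than hunt back and forth across zero.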

When networks of neurons are used, it is possible that a single set of driving circuits could be employed to actuate all of the adaptive neurons. At present, this is not practical for large networks because of the power levels required. The MAD elements shown in Fig. 14 are quite large, and might ultimately be made much smaller, perhaps in the form of thin films. It should ultimately be possible to mass produce large networks of adaptive microelectronic logical elements. Power levels, space and weight requirements, and cost should all be low. These neurons should be thought of and treated as new kinds of circuit elements, adaptive logical components.

VIII.  APPLICATIONS  FOR  ADAPTIVE  LOGICAL  CIRCUIT  ELEMENTS. 

The field of application of digital systems may be classified into two broad categories, fixed systems and adaptive systems. The structure of the fixed system is completely determined by the designer, while the adaptive system is designed to have both fixed and adjustable portions. The latter system has the ability to automatically modify its adjustable parts by trial and error experience in order to optimize performance (this is performance feedback). Fixed systems are by far the most common at present. Adaptive systems have received intensive study during the past several years, and some practical applications are being made in automatic control and in the recognition (classification) of pattern information.

Both fixed and adaptive systems may be realized by programming general purpose digital computers. These computers, on the other hand, are built from conventional logical components (flip-flops, gates, etc.), but could be built from networks of adaptive neurons. All details of organization, design, and construction of computers must be completely planned in the present day scheme of things. If a computer were built of adaptive neurons, details of structure could be imparted by the designer by training (showing it examples of what he would like it to do) rather than by direct designing. This design concept becomes more significant as size and complexity of digital systems increase. The demands of modern technology are such that larger and more complex digital systems are continually being contemplated, and in step with this, progress in microelectronics makes such systems physically and economically possible.

The problem of reliability is greatly aggravated by increase in size and complexity. Significant steps in improving the reliability of digital systems have been made, notably with the introduction of the magnetic-core memory, and the use of high-speed switching transistors as active logical elements. Although the reliability of individual components has constantly increased, the requirement in numbers of components has increased in many cases far more rapidly. It is not expected that mass-produced microminiature components will ever be perfectly reliable, yet they will be usable in large systems. The problem is to devise new systems techniques to achieve reliable over-all operation where systems consist of large numbers of interacting imperfect components.

Errors caused by computer component failure are, in general, more deleterious to a fixed system. In the event of a failure, the adaptive system will adjust whatever remains adjustable to do the "best" job with the intact parts. As long as the adaption mechanism is reliable, system reliability is inherently increased.

Shannon and Moore (Ref. 12) and von Neumann (Ref. 3) have proposed schemes for making reliable fixed digital systems from unreliable components by using redundancy. Another method, using adaptive logic, is hereby proposed for improving system reliability.

The reliability of a system whose purpose is non-adaptive may be increased by combining adaption and redundancy. Consider a multiplex consisting of three machines solving the same problem with the same input data. Let the output of each machine be a single binary number, expressed as +1 or -1. If these machines were perfectly reliable, their outputs would always agree. If not, then von Neumann proposed that the majority should rule. The neuron shown in Fig. 1 with $a_0$ set to zero, and the other gains set to +1, would give a majority output. Each machine has equal vote. Unequal vote (higher vote going to the more reliable machine) is possible by making the a's adjustable, and causing these adjustments to be made automatically to optimize performance. The adaptive vote taker is identical to the adaptive neuron. The vote taker can be trained by periodically injecting a certain input when the desired output is known.

Von  Neumann's  majority  rule  vote  taker  will  give  the  correct  outcome
when  the  majority  is  correct.  The  adaptive  vote  taker  could  ideally  give 
the  correct  outcome  with  only  a  single  correct  machine  by  giving  it  a 
heavy  vote  and  attenuating  the  votes  of  the  unreliable  machines.  This  is 
in  effect  an  adaptive  routing  procedure  for  information  flow,  and  allows 
systems  in  a  small  measure  to  be  self-healing. 
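
The difference between the two vote takers can be shown with a small Python sketch. The neuron is used with the level control set to zero; the unequal gains are a hypothetical setting of the kind training might produce when the first machine is the reliable one, and are not values from the report.

```python
def vote(gains, outputs):
    """Neuron as vote taker: the sign of a0 + sum of a_i * x_i."""
    total = gains[0] + sum(a * x for a, x in zip(gains[1:], outputs))
    return 1 if total >= 0 else -1


majority_gains = [0.0, 1.0, 1.0, 1.0]   # a0 = 0, equal votes
weighted_gains = [0.0, 3.0, 1.0, 1.0]   # assumed: machine 1 carries a heavy vote

# Suppose the correct answer is +1 and only machine 1 reports it.
outputs = [1, -1, -1]
majority_result = vote(majority_gains, outputs)   # -1, outvoted
weighted_result = vote(weighted_gains, outputs)   # +1, heavy vote prevails
```

With equal gains the single correct machine is outvoted, whereas the weighted setting lets it prevail, which is the self-healing behavior the text describes.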

The effectiveness of the adaptive vote taker is being evaluated by William Pierce in doctoral thesis research at Stanford University. It has been shown that the effective multiplex factor can be greatly increased by adaption (particularly where the machines are fairly unreliable), and that system life expectancy can also be greatly increased by adaption and redundancy. This work will be described in a Stanford University technical report.

When adaptive neuron elements become available in large quantities, adaptive logical and computing systems will probably be organized quite differently from the way modern computing systems are organized. The organizations of two related adaptive system types will be considered, that of adaptive pattern classifiers and of adaptive problem-solving machines.

The realization schemes utilized by Clark and Farley (Ref. 13), Rosenblatt (Ref. 14), and Mattson (Refs. 5, 6) for adaptive pattern classifiers made use of digital simulation. The approach suggested by this work is that adaptive pattern classifiers be constructed of networks of adaptive neuron elements.

One of the most promising areas of research in computer system theory is that of problem-solving machines, theorem-proving machines, and artificially "intelligent" machines. The earliest proponents of this research were Turing and Shannon. Their suggestions were successfully put to practice by Newell, Simon, and Shaw, by Samuel, and by others. Problem-solving has been regarded as a multistage decision process which begins with an initial status and ends with a goal status. Each change in status results from the selection of a certain move from a collection of possibilities which are "legal" according to the rules of the game. Since the number of chains of moves increases approximately exponentially with the length of the chains, exhaustively trying all chains in search of a goal is not practical, even for simple problems.

The approach taken by Samuel in his checker-playing simulations to reduce the number of chains to be tested was two-fold. The length of the chains was limited to be somewhere between ten and thirty moves ahead (a "ply" of 10 to 30), and since most chains would not terminate by reaching goals, a system of status evaluation was developed so that the various chains could be numerically compared. The second method of reducing the number of chains to be tested was to check against games stored in the memory. If an identical situation was encountered previously, certain evaluations have already been made and need not be repeated. This use of stored games was called "rote learning". A procedure for making the status evaluation system adaptive was called "generalized learning". Both of these learning methods could be used simultaneously.

The rote learning portion of the over-all procedure could be made much more powerful if it were possible to extract from the memory previous situations that are similar (not necessarily identical) to the current situation. Far less experience and storage would be needed to reach a given level of competence of play. Similar means that the previous situation is in the same subclass with the current situation. A classification scheme would be needed to establish similarities in checker situations. The structure of this classifier would have to be formed from experience.

An automatic problem-solving computer should have a memory system from which information could be extracted according to classification rather than by address number. The extent of classification before storing should be slight (e.g., is the pattern of checkers or of chess?), and a consistent scheme for the arrangement of the pattern bits should be established before storing. Final classification should be done within the memory itself. Each storage register should contain an Adaline or a network of Adalines.

A request from a "central control" for a certain type of information
is sent to every register in the memory simultaneously. This has the
effect of setting the adjustments of all the Adalines. Only the registers
whose classifiers respond properly (e.g., give +1 outputs) answer the
request and transmit their information back to the "central control".
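
The request-and-answer scheme just described can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a description of any actual hardware: the names Register, adaline_output, and recall are hypothetical, each register is assumed to hold a single Adaline acting on its own stored pattern bits (+1 or -1), and a sequential loop stands in for what would be simultaneous access to every register.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Register:
    pattern: Sequence[int]  # stored pattern bits, each +1 or -1
    payload: str            # information sent back when the register answers


def adaline_output(weights: Sequence[float], bias: float, pattern: Sequence[int]) -> int:
    """Quantized linear output: +1 if the weighted sum is non-negative, else -1."""
    total = bias + sum(w * x for w, x in zip(weights, pattern))
    return 1 if total >= 0 else -1


def recall(memory: List[Register], weights: Sequence[float], bias: float) -> List[str]:
    """Apply one weight setting to every register; only +1 registers answer."""
    return [r.payload for r in memory if adaline_output(weights, bias, r.pattern) == 1]


memory = [
    Register((+1, +1, -1, -1), "situation A"),
    Register((+1, -1, -1, +1), "situation B"),
    Register((-1, -1, +1, +1), "situation C"),
]

# Assumed request: "first bit is +1 and third bit is -1".
# The bias of -1 makes the Adaline respond only when both conditions hold.
answers = recall(memory, weights=(1.0, 0.0, -1.0, 0.0), bias=-1.0)
```

Here the weight setting plays the role of the request: the same adjustments are applied to every register, and only registers A and B, whose stored patterns satisfy the request, return their information.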

Very sophisticated learning procedures would become possible if one
has such recall-by-association parallel-access memory systems. The simplicity
of Adaline and the progress being made in microelectronics give
a strong indication that such memory systems will come into existence in
the not too distant future.